Measuring Image Distances via Embedding in a Semantic Manifold
Authors
Abstract
In this work we introduce novel image metrics that can be used with distance-based classifiers or directly to decide whether two input images belong to the same class. While most prior image distances rely purely on comparisons of low-level features extracted from the inputs, our metrics use a large database of labeled photos as auxiliary data to draw semantic relationships between the two images, beyond those computable from simple visual features. In a preprocessing stage, our approach derives a semantic image graph from the labeled dataset, where the nodes are the labeled images and the edges connect pictures with related labels. The graph can be viewed as modeling a semantic image manifold, and it enables the use of graph distances to approximate semantic distances. Thus, we reformulate the task of measuring the semantic distance between two unlabeled pictures as the problem of embedding the two input images in the semantic graph. We propose and evaluate several embedding schemes and graph distance metrics. Our results on Caltech101, Caltech256 and ImageNet show that our distances consistently match or outperform the state of the art.
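The pipeline described in the abstract (build a graph over labeled images, embed two unlabeled images, measure a graph distance) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `related` set of label pairs, the unit weight on semantic edges, attaching each query to its `k` visually nearest labeled images with Euclidean feature distance as edge weight, and shortest-path length as the graph distance. The abstract states that several embedding schemes and graph metrics are evaluated but does not specify them.

```python
import heapq
import math
from itertools import combinations


def build_semantic_graph(labels, related):
    """Nodes are indices of labeled images; an edge joins images with related labels.

    `related` is a hypothetical set of frozenset label pairs treated as related.
    """
    graph = {i: {} for i in range(len(labels))}
    for i, j in combinations(range(len(labels)), 2):
        same = labels[i] == labels[j]
        if same or frozenset((labels[i], labels[j])) in related:
            graph[i][j] = graph[j][i] = 1.0  # unit edge weight (assumption)
    return graph


def embed(graph, features, query, node_id, k):
    """Attach an unlabeled query image to its k visually nearest labeled images."""
    order = sorted(range(len(features)), key=lambda i: math.dist(features[i], query))
    graph[node_id] = {}
    for i in order[:k]:
        weight = math.dist(features[i], query)  # visual distance as edge weight
        graph[node_id][i] = weight
        graph[i][node_id] = weight


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm; returns math.inf when dst is unreachable."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def semantic_distance(labels, features, related, query_a, query_b, k=2):
    """Embed both queries in the semantic graph and return their graph distance."""
    graph = build_semantic_graph(labels, related)
    n = len(labels)
    embed(graph, features, query_a, n, k)      # query A becomes node n
    embed(graph, features, query_b, n + 1, k)  # query B becomes node n + 1
    return shortest_path(graph, n, n + 1)
```

In this sketch, two queries that attach to images of the same or related classes end up a short path apart, while queries that attach only to unrelated classes can be unreachable from each other, in which case the function returns infinity. The pairwise graph construction is quadratic in the number of labeled images, so it is only suitable for small illustrative datasets.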
Similar resources
Zero-Shot Learning on Semantic Class Prototype Graph.
Zero-Shot Learning (ZSL) for visual recognition is typically achieved by exploiting a semantic embedding space. In such a space, both seen and unseen class labels as well as image features can be embedded so that the similarity among them can be measured directly. In this work, we consider that the key to effective ZSL is to compute an optimal distance metric in the semantic embedding space. Ex...
Connected Component Based Word Spotting on Persian Handwritten Image Documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Image feature optimization based on nonlinear dimensionality reduction
Image feature optimization is an important means to deal with high-dimensional image data in image semantic understanding and its applications. We formulate image feature optimization as the establishment of a mapping between high- and low-dimensional space via a five-tuple model. Nonlinear dimensionality reduction based on manifold learning provides a feasible way for solving such a problem. We ...
Separating pose and expression in face images: a manifold learning approach
Digital images of a person’s face display a wide range of variations from differing pose, expression, and illumination conditions. Such variations can be modeled empirically by an appearance manifold in the image space. In this paper, we tackle the problem of learning the appearance manifold of faces in an unsupervised way. In particular, we aim to extract the substructure of facial expressions...
Manifold of Facial Expression
In this paper, we propose the concept of Manifold of Facial Expression based on the observation that images of a subject’s facial expressions define a smooth manifold in the high dimensional image space. Such a manifold representation can provide a unified framework for facial expression analysis. We first apply Active Wavelet Networks (AWN) on the image sequences for facial feature localizatio...
Publication date: 2012